Storage system

ABSTRACT

A storage system includes a plurality of data disks that store information, and a parity disk that corresponds to a disk group including some of the plurality of data disks and stores parity information generated on the basis of data of the data disks included in the corresponding disk group. Any of the data disks is included in a plurality of the disk groups.

CROSS-REFERENCE TO RELATED APPLICATION

This application is based upon and claims the benefit of priority of the prior Japanese Patent Application No. 2013-218706, filed on Oct. 21, 2013, the entire contents of which are incorporated herein by reference.

FIELD

The embodiments discussed herein are related to a technique for improving reliability of a distributed storage.

BACKGROUND

In the field of cloud storage, a replication technique for avoiding a data loss caused by a disk fault or a block fault by replicating data is widespread. In recent years, an erasure code technique that further improves reliability and capacity efficiency by efficiently encoding data to provide minimum redundancy has been actively researched and developed.

However, RAID, a representative method of the erasure code technique, needs to calculate parities from all disks to be made redundant, posing a problem in that the data transfer amount between nodes increases in a distributed storage and the network becomes a bottleneck.

Accordingly, a technique has been presented that reduces the data transfer amount in a network when a single disk is recovered, by calculating some of the parities from only some of the disks in a cloud storage rather than from all of them.

Note that techniques described in the following documents are known.

Japanese Laid-open Patent Publication No. 2005-44182

Japanese Laid-open Patent Publication No. 2007-257630

Japanese Laid-open Patent Publication No. 07-200187

SUMMARY

According to an aspect of the embodiment, a storage system includes a plurality of data disks that store information, and a parity disk that corresponds to a disk group including some of the plurality of data disks, and stores parity information generated on the basis of data of the data disks included in the corresponding disk group. Any of the data disks is included in a plurality of the disk groups.

The object and advantages of the invention will be realized and attained by means of the elements and combinations particularly pointed out in the claims.

It is to be understood that both the foregoing general description and the following detailed description are exemplary and explanatory and are not restrictive of the invention.

BRIEF DESCRIPTION OF DRAWINGS

FIG. 1 is a functional block diagram illustrating a configuration of an implementation example of a storage system.

FIG. 2 illustrates an example of a configuration of an information processing system.

FIG. 3 illustrates an example of a configuration of a management server.

FIG. 4 illustrates an example of calculation ranges of local parities in a first embodiment.

FIG. 5 is a flowchart illustrating a process in the first embodiment which is executed when a disk fault occurs.

FIG. 6 illustrates an example of a hardware configuration of the management server.

FIG. 7 illustrates an example of calculation ranges of local parities in a second embodiment.

FIG. 8 illustrates an example of calculation ranges of local parities in a third embodiment.

FIG. 9 illustrates an example of calculation ranges of local parities in a fourth embodiment.

FIG. 10 is a first half of a flowchart illustrating a process in a fifth embodiment which is executed when a disk fault occurs.

FIG. 11 is a latter half of the flowchart illustrating the process in the fifth embodiment which is executed when the disk fault occurs.

FIG. 12 illustrates an example of a data configuration in a comparison example 1.

FIG. 13 illustrates an example of a data configuration in a comparison example 2.

FIG. 14 illustrates an example of a performance comparison between the fifth embodiment and the comparison examples.

DESCRIPTION OF EMBODIMENTS

For example, in a system where some of the parities are arranged so as to be calculated from only some of the disks, the parities are in some cases not suitably arranged to handle a multiple fault in the disks. In this case, a disk needs to be recovered by using a global parity (a parity calculated by using data of all disks) when a multiple fault occurs in the disks. Moreover, in this case, it is necessary to transfer data from as many disks as there are data disks to the nodes that respectively manage the faulty disks.

According to an embodiment, a storage system that includes parity disks optimally arranged to recover a faulty disk can be provided.

FIG. 1 is a functional block diagram illustrating a configuration of an implementation example of a storage system. In FIG. 1, the storage system 1 includes data disks 2, parity disks 3, and a recovering unit 4.

The data disks 2 store information. The parity disks 3 correspond to a disk group including some of the plurality of data disks 2, and store parity information generated on the basis of data of data disks 2 included in a corresponding disk group. Moreover, any of the data disks 2 is included in a plurality of disk groups.

Each of the plurality of data disks 2 is included in any of the disk groups. Moreover, an equal number of data disks 2 is included in each of the disk groups.

The plurality of parity disks 3 are classified into a plurality of parity groups so that an arbitrary pair of parity disks 3 within each of the parity groups does not include the same data disk 2 in a corresponding disk group. Moreover, each of the data disks 2 is included in any of the disk groups corresponding to the parity disks 3 included in each of the parity groups.

Identification numbers are made to correspond respectively in ascending order to a storage order of data stripes of the data disks 2, and a minimum value of the identification numbers of the data disks 2 included in the disk group that corresponds to each of the parity disks 3 is shifted by a number calculated by dividing the number of data disks by the number of parity disks.

When a fault occurs in one or more data disks 2, the recovering unit 4 recovers data of each faulty data disk 2 by using a parity disk 3 corresponding to a disk group including the faulty data disk 2. Moreover, the recovering unit 4 selects a number of parity disks which is equal to that of the faulty data disks 2 so that all the faulty data disks 2 are included in disk groups corresponding to any of the parity disks 3 to be selected and the one or more faulty data disks 2 are included in the disk groups corresponding to all the parity disks 3 to be selected, and the recovering unit 4 recovers the data of the faulty data disks on the basis of the selected parity disks 3. Additionally, the recovering unit 4 selects a number of parity disks 3 which is equal to that of the faulty data disks 2 so that all the faulty data disks 2 are included in the disk groups of any of the parity disks 3 to be selected, the one or more faulty data disks 2 are included in the disk groups corresponding to all the parity disks 3 to be selected, and a sum of the sets of data disks 2 included in the disk groups corresponding to the parity disks 3 to be selected is minimized, and the recovering unit 4 recovers the data of the faulty data disks 2 on the basis of the selected parity disks 3.

First Embodiment

FIG. 2 illustrates an example of a configuration of an information processing system according to a first embodiment. The information processing system includes a client server 21, and a plurality of management servers 22 (22 a to 22 c). The client server 21 is connected to the plurality of management servers 22 via a network or a bus. To each of the plurality of management servers 22, a plurality of disks 23 (23 a to 23 l) are respectively connected. Data input and output to and from the disks are performed respectively by the management servers 22.

In the information processing system, a disk array including some of a plurality of disks is configured. For example, a disk array including the disk 23 a, the disk 23 e and the disk 23 i is considered to be one disk array. Moreover, for example, a disk array including the disk 23 a, the disk 23 b and the disk 23 c is considered to be another disk array.

The client server 21 accepts data from a user terminal, divides the accepted data, and transmits the divided data to the management servers 22 managing disks that configure a disk array.

Each of the management servers 22 receives data from the client server 21, and stores the received data on disks managed by the local management server 22.

Here, the data stored in a disk array is arranged sequentially by a specified size on different physical disks within the array. A dataset arranged on each of the disks by the specified size is referred to as a strip. A dataset including a plurality of strips is referred to as a stripe.

A storage space of a disk array includes a plurality of stripes. A disk array includes data disks and parity disks. The data disks are disks on which data from a user who uses the disk array is stored. The parity disks are disks on which parity information used to recover data of a data disk when a fault occurs in the disk is stored.

Each of the parity disks is made to correspond to a plurality of data disks, and parity information stored on the parity disk is calculated on the basis of data of the plurality of data disks made to correspond to the parity disk. The plurality of data disks made to correspond to the parity disk are sometimes referred to as data disks included in a calculation range of a parity disk in the following description. Parity information may be, specifically, a Reed-Solomon code, or information for correcting various types of errors.

In the first embodiment, data disks included in a calculation range of each parity disk are assumed to be not all but some of the data disks of a disk array. Here, a parity disk including some of the data disks as a calculation range is sometimes referred to as a local parity in the following description. Moreover, all local parities have mutually different calculation ranges. Additionally, each of the local parities is assumed to include at least one data disk common to a data disk included in a calculation range of at least one other local parity. Furthermore, each of the data disks included in a disk array is assumed to be included in the calculation range of at least one of the local parities. Note that data disks included in a calculation range of each parity disk may be a plurality of consecutive data disks. When data disks are consecutive, this indicates a relationship between data disks, such that data is written as initial data of one strip next to data written to the end of another strip.

Each of the plurality of management servers 22 includes a control unit 31, a recovering unit 32, a parity generating unit 33, and a storage unit 34. The control unit 31 reads and writes data of a disk in response to a request issued from the client server 21 or a different management server 22, and returns a result of the read or the write to the request source. When a fault occurs in a data disk managed by the management server 22, the recovering unit 32 recovers data of the faulty data disk by using data of a local parity. The parity generating unit 33 calculates parity information by using data stored on data disks included in a calculation range of each local parity, and stores the calculated parity information in the local parity. The storage unit 34 makes an association between identification information of each local parity included in a disk array and identification information of data disks included in the calculation range of the local parity, and stores information which includes the association as association information.

The following description refers to a case of k=10 and p=6 (and therefore n=16), on the basis of the assumption that the number of all disks included in a disk array is n, the number of data disks is k, and the number of local parities is p=n−k.

FIG. 4 illustrates an example of calculation ranges of local parities in a disk array including ten data disks and six local parities. The data disks and the local parities are indicated by D1 to D10 and L1 to L6, respectively.

The local parity L1 includes the five data disks D1 to D5 as a calculation range. The local parities L2, L3, L4, L5 and L6 respectively include the data disks D2 to D6, D4 to D7, D5 to D10, D8 to D2, and D9 to D3 as a calculation range. A range such as D8 to D2 wraps around from D10 to D1, namely, it consists of D8, D9, D10, D1 and D2. Here, no two of the local parities L1 to L6 have the same calculation range, and each of the local parities includes, in its calculation range, at least one data disk common to at least one different local parity. Namely, for example, L1 includes, as a calculation range, the data disks D2 to D5 common to L2, and also includes the data disks D1 to D3 common to L6.
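The calculation ranges of FIG. 4 can be written down as a small lookup table, comparable to the association information held by the storage unit 34. The following Python sketch is illustrative only; the names `NUM_DATA_DISKS`, `wrap_range`, `CALC_RANGES` and `shared_disks` are assumptions introduced here and are not part of the embodiment.

```python
NUM_DATA_DISKS = 10  # D1 to D10 in FIG. 4


def wrap_range(first, last, k=NUM_DATA_DISKS):
    """Return data disk numbers from `first` to `last`, wrapping from Dk to D1."""
    disks = []
    d = first
    while True:
        disks.append(d)
        if d == last:
            break
        d = d % k + 1  # D10 is followed by D1
    return disks


# Calculation range of each local parity, as listed for FIG. 4.
CALC_RANGES = {
    "L1": wrap_range(1, 5),
    "L2": wrap_range(2, 6),
    "L3": wrap_range(4, 7),
    "L4": wrap_range(5, 10),
    "L5": wrap_range(8, 2),
    "L6": wrap_range(9, 3),
}


def shared_disks(a, b):
    """Data disks common to the calculation ranges of local parities a and b."""
    return sorted(set(CALC_RANGES[a]) & set(CALC_RANGES[b]))
```

With this table, `shared_disks("L1", "L2")` gives D2 to D5 and `shared_disks("L1", "L6")` gives D1 to D3, in line with the description above.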

A case where a data disk is recovered when a fault occurs in the data disk is described next. When a fault occurs in a disk, the management server 22 that manages the faulty disk initially receives, from another device (such as a client server), a notification that the fault has occurred in the disk. Then, a standby disk intended to store data obtained by recovering contents of the faulty disk as a replacement for the faulty disk is allocated to the control unit 31 of the management server 22.

Next, the recovering unit 32 of the management server 22 selects a local parity used to recover the faulty disk from among a plurality of local parities included in a disk array, and recovers the data of the faulty data disk by using data of the selected local parity.

Specifically, the recovering unit 32 initially selects a combination of local parities, which satisfies the following three conditions (hereinafter referred to as selection conditions), as local parities to be used for the recovery. The first selection condition is that the number of elements of the combination of local parities be equal to that of faulty data disks. For example, when the number of faulty data disks is three, three local parities are selected as local parities included in the combination. The second selection condition is that each of the local parities included in the combination include at least one faulty data disk in a calculation range. Namely, a local parity that includes none of the faulty data disks in a calculation range is not selected as an element of the combination. The third selection condition is that each of the faulty data disks be included in the calculation range of at least one of the local parities included in the combination.

A selection of a combination of local parities used for a recovery is described with reference to the example illustrated in FIG. 4. A case where a fault occurs in three data disks D2, D5 and D8 in the example illustrated in FIG. 4 is considered. A combination of local parities used for the recovery satisfies the selection conditions. In the example illustrated in FIG. 4, the number of faulty data disks is three. Therefore, three local parities are selected as local parities to be used for the recovery. Since each selected local parity includes at least one of the faulty data disks in its calculation range, the local parities are selected from among L1 to L6, each of which includes at least one of D2, D5 and D8 in its calculation range. Moreover, a combination of local parities which satisfies the condition that each faulty disk be included in the calculation range of at least one of the three selected local parities is selected. Combinations of local parities whose calculation ranges include each of D2, D5 and D8 are decided, for example, as follows.

In order for each of D2, D5 and D8 to be included in any of the calculation ranges of the three local parities, local parities may be selected one at a time, without duplication, from the sets of local parities that respectively include D2, D5 and D8 in a calculation range. Namely, in FIG. 4, the set of local parities including D2 in a calculation range is {L1, L2, L5, L6}. The set of local parities including D5 in a calculation range is {L1, L2, L3, L4}. The set of local parities including D8 in a calculation range is {L4, L5}. Accordingly, the combination of local parities is any of (L1, L2, L4), (L1, L2, L5), (L1, L3, L4), (L1, L3, L5), (L1, L4, L5), (L2, L3, L4), (L2, L3, L5), (L2, L4, L5), (L3, L4, L5), (L6, L1, L4), (L6, L1, L5), (L6, L2, L4), (L6, L2, L5), (L6, L3, L4), (L6, L3, L5) and (L6, L4, L5). The recovering unit 32 selects any of these combinations as the combination of local parities used for the recovery.
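The enumeration for the faults in D2, D5 and D8 can be reproduced by applying the three selection conditions directly. The following Python sketch is only one possible way to do this; the function name `candidate_combinations` and the dictionary `CALC_RANGES` are hypothetical names introduced for illustration.

```python
from itertools import combinations

# Calculation ranges of FIG. 4 (data disk numbers per local parity).
CALC_RANGES = {
    "L1": {1, 2, 3, 4, 5},
    "L2": {2, 3, 4, 5, 6},
    "L3": {4, 5, 6, 7},
    "L4": {5, 6, 7, 8, 9, 10},
    "L5": {8, 9, 10, 1, 2},
    "L6": {9, 10, 1, 2, 3},
}


def candidate_combinations(faulty, ranges=CALC_RANGES):
    """List the combinations of local parities that satisfy the selection conditions."""
    faulty = set(faulty)
    result = []
    # Condition 1: as many local parities as there are faulty data disks.
    for combo in combinations(sorted(ranges), len(faulty)):
        # Condition 2: every selected parity covers at least one faulty disk.
        if any(not (ranges[p] & faulty) for p in combo):
            continue
        # Condition 3: every faulty disk is covered by some selected parity.
        covered = set().union(*(ranges[p] for p in combo))
        if faulty <= covered:
            result.append(combo)
    return result
```

For the faults {2, 5, 8}, this sketch returns 16 combinations, the same number as in the list above; a combination such as (L1, L2, L3) is rejected because none of its calculation ranges contains D8.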

After the recovering unit 32 selects any of the combinations of local parities which satisfy the selection conditions, the recovering unit 32 recovers the data of the faulty data disks by solving simultaneous linear equations derived from the parity calculation expressions of the selected combination of local parities.

Specifically, after the recovering unit 32 initially selects the combination of local parities used for the recovery, the recovering unit 32 obtains data stored in the local parities of the selected combination, and data stored in data disks included in calculation ranges of the local parities of the selected combination. Note that, however, the faulty data disks are not regarded as targets to be obtained. For instance, in the example illustrated in FIG. 4, when (L1, L2, L5) is selected as the combination of local parities, the recovering unit 32 initially obtains data stored on the local parities L1, L2 and L5. At the same time, the recovering unit 32 obtains data stored on D1, D3, D4, D6, D9 and D10, which are data disks that are included in the calculation ranges of L1, L2 and L5 and are not faulty. The recovering unit 32 that has selected the combination may acquire the data to be obtained, for example, by transmitting an acquisition request to the management server 22 that manages the disk on which the data is stored, and by receiving a reply to the request. Moreover, the recovering unit 32 can identify a data disk included in the calculation range of the local parity by referencing association information stored in the storage unit 34.

The recovering unit 32 recovers the data of D2, D5 and D8 by solving simultaneous linear equations with three unknowns, which are derived from the parity calculation expressions by using the obtained data of the local parities and the data of the data disks included in the calculation ranges of the local parities.
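The recovery of D2, D5 and D8 from (L1, L2, L5) can be illustrated with a toy linear code. The sketch below is not the parity calculation expression of the embodiment: it assumes, purely for illustration, arithmetic modulo the prime 257 and hypothetical coefficients j^i for data disk Di in local parity Lj, whereas a practical Reed-Solomon code typically uses a field such as GF(2^8). All function names are likewise assumptions.

```python
P = 257  # assumed prime modulus standing in for the code's finite field

CALC_RANGES = {
    1: {1, 2, 3, 4, 5}, 2: {2, 3, 4, 5, 6}, 3: {4, 5, 6, 7},
    4: {5, 6, 7, 8, 9, 10}, 5: {8, 9, 10, 1, 2}, 6: {9, 10, 1, 2, 3},
}


def coeff(j, i):
    """Hypothetical encoding coefficient of data disk Di in local parity Lj."""
    return pow(j, i, P)


def make_parity(j, data):
    """Parity value of Lj: weighted sum of the data disks in its range."""
    return sum(coeff(j, i) * data[i] for i in CALC_RANGES[j]) % P


def solve_mod(a, b):
    """Solve a*x = b (mod P) by Gauss-Jordan elimination for a square matrix a."""
    n = len(a)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = next((r for r in range(col, n) if m[r][col] % P), None)
        if pivot is None:
            raise ValueError("equations are not independent")
        m[col], m[pivot] = m[pivot], m[col]
        inv = pow(m[col][col], -1, P)  # modular inverse (Python 3.8+)
        m[col] = [v * inv % P for v in m[col]]
        for r in range(n):
            if r != col and m[r][col]:
                f = m[r][col]
                m[r] = [(v - f * w) % P for v, w in zip(m[r], m[col])]
    return [row[n] for row in m]


def recover(faulty, parities, surviving, parity_values):
    """Recover the faulty data disks from the selected local parities."""
    faulty = sorted(faulty)
    a, b = [], []
    for j in parities:
        # Move the contribution of the surviving disks to the right-hand side.
        known = sum(coeff(j, i) * surviving[i]
                    for i in CALC_RANGES[j] if i not in faulty)
        a.append([coeff(j, i) if i in CALC_RANGES[j] else 0 for i in faulty])
        b.append((parity_values[j] - known) % P)
    return dict(zip(faulty, solve_mod(a, b)))
```

In this sketch each selected local parity contributes one equation whose unknowns are the faulty data disks in its calculation range, so three parities yield three equations with three unknowns.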

Then, the recovering unit 32 stores the recovered data of D2, D5 and D8 on standby disks that are newly allocated to the respective data disks D2, D5 and D8, and completes the recovery.

When p=6, namely, when six local parities are present as illustrated in the example of FIG. 4, a maximum of six data disks can be recovered by solving linear equations with six unknowns with the use of the parity calculation expression of the six local parities.

A solution to linear equations with multiple unknowns can be obtained, for example, by using various methods such as Gaussian elimination (the sweep-out method).

A recovery in a case where faulty disks include data disks and a local parity is described next. Assume that the faulty local parity includes a faulty data disk in a calculation range. When a fault occurs in a local parity that does not include a faulty data disk in a calculation range, the recovering unit 32 recreates the local parity by using data of normal data disks.

When faulty disks include data disks and a local parity, data of the faulty local parity cannot be used to recover data of the data disks. Therefore, the recovering unit 32 recovers the data of the data disks by using a normal local parity. Thereafter, the recovering unit 32 recovers (recreates) the data of the faulty local parity by using the recovered data of the data disks. For instance, when a fault occurs in L1, D5 and D8 in the example illustrated in FIG. 4, the recovering unit 32 initially recovers D5 and D8 by using a normal local parity. Thereafter, the recovering unit 32 recovers (recreates) parity information of L1 by using the recovered data of D5 and data of D1 to D4.

A recovery method used when faulty disks include data disks and a local parity is the same as that used to recover a data disk when a fault occurs in the data disk except that a faulty local parity is excluded as a local parity used for the recovery.

Specifically, the recovering unit 32 initially selects a combination of local parities under a condition that a fault does not occur in a local parity to be selected in addition to the above described three selection conditions in the selection of local parities used for the recovery.

When a fault occurs in L1, D5 and D8 in the example illustrated in FIG. 4, the number of faulty data disks is two. Therefore, two local parities are selected as local parities used for the recovery. Local parities to be selected are not faulty local parities and respectively include at least one faulty data disk in a calculation range. Therefore, any two of L2 to L5 are selected. Moreover, a combination of local parities which satisfies the condition that each faulty data disk be included in a calculation range of either of the two selected local parities is selected. Such a combination is any of (L2, L4), (L2, L5), (L3, L4), (L3, L5) and (L4, L5). The recovering unit 32 selects any of these combinations as a combination of local parities used for the recovery.

Then, the recovering unit 32 obtains data of the local parities of the selected combination, and data of data disks that are included in calculation ranges of the local parities of the selected combination and are not faulty. The recovering unit 32 recovers the data of the faulty data disks by solving simultaneous linear equations with multiple unknowns with the use of the parity calculation expression by using the obtained data.

Upon completion of the recovery of the faulty data disks, the recovering unit 32 recreates a local parity by using the recovered data of the data disks, and data of other normal data disks included in the calculation range.

It has been assumed that a notification from another device is received when a fault occurs in a disk. However, the control unit 31 of the management server 22 that manages a faulty disk may detect a fault in the disk.

A recovery overhead is described next. In the first embodiment, a recovery overhead is defined as a data transfer amount of data received when a node that manages a faulty disk recovers the data. For simplification of an explanation, it is assumed that a data transfer amount of data stored on one disk is counted as 1.

When a single fault occurs, the recovery overhead equals the number of data disks included in the calculation range of the local parity used for the recovery. In the following description, the number of data disks included in a calculation range of a local parity is sometimes referred to as a size of the calculation range of the local parity. For instance, in the example illustrated in FIG. 4, a recovery overhead when a single fault occurs is 6 at maximum. Here, the recovery overhead when a single fault is recovered by using L4 is described. The calculation range of L4 is the six data disks D5 to D10. Accordingly, when a fault occurs in any of the data disks D5 to D10 and the disk is recovered by using L4, data of a total of six disks, namely the five data disks that are not faulty among D5 to D10, and L4, are transferred. Accordingly, the recovery overhead in this case is 6.

When a multiple fault occurs, the recovery overhead is the total of the data of the local parities of the selected combination and the data of the data disks that are included in the calculation ranges of those local parities and are not faulty. For example, when a double fault occurs in D2 and D5, which are recovered by using data of L1 and L6 in FIG. 4, data of the seven disks D1, D3, D4, D9, D10, L1 and L6 are transferred to the management servers that respectively manage D2 and D5. Accordingly, a recovery overhead in this case is 7. In the case of the configuration illustrated in FIG. 4, a recovery overhead when a double fault occurs in data disks is 6 or 7 (6.36 on average).
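The recovery overhead as defined above can be counted mechanically. The following Python sketch assumes the FIG. 4 calculation ranges and counts each disk as 1; the function name `recovery_overhead` is a hypothetical name introduced for illustration.

```python
# Calculation ranges of FIG. 4 (data disk numbers per local parity).
CALC_RANGES = {
    "L1": {1, 2, 3, 4, 5},
    "L2": {2, 3, 4, 5, 6},
    "L3": {4, 5, 6, 7},
    "L4": {5, 6, 7, 8, 9, 10},
    "L5": {8, 9, 10, 1, 2},
    "L6": {9, 10, 1, 2, 3},
}


def recovery_overhead(faulty, parities, ranges=CALC_RANGES):
    """Number of disks whose data is transferred for the recovery.

    Counts the non-faulty data disks in the calculation ranges of the
    selected local parities, plus the selected local parities themselves.
    """
    faulty = set(faulty)
    in_range = set().union(*(ranges[p] for p in parities))
    return len(in_range - faulty) + len(parities)
```

For example, `recovery_overhead({2, 5}, ["L1", "L6"])` evaluates to 7 and `recovery_overhead({7}, ["L4"])` to 6, as in the description, while selecting (L1, L2) for the same double fault gives 6.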

An overhead evaluation when an initial parity is generated is described next. In the first embodiment, an overhead when an initial parity is generated (initialization overhead) is defined as a data transfer amount of data received by the management server 22 of a local parity when parity information stored in the local parity is generated.

When an initial parity is generated, the parity generating unit 33 of the management server 22 of each local parity generates a parity by obtaining data of all data disks included in a calculation range of the local parity. Accordingly, the overhead when the initial parity is generated equals the amount of data stored on the data disks in the calculation range of the local parity.

For instance, in the example illustrated in FIG. 4, a many-to-many data transfer is performed in parallel in such a way that data of D1 to D5 and data of D2 to D6 are respectively transferred to the management servers 22 of L1 and L2 when an initial parity is generated. Note that a data read from a data disk is performed only once by each of the management servers 22 (data read once is cached in a memory and transferred to a plurality of nodes).

In the example illustrated in FIG. 4, the local parity having the largest calculation range is L4, whose calculation range has a size of 6. Therefore, when the initial parity is generated, data of six data disks are transferred at maximum. Accordingly, an initialization overhead in the system illustrated in FIG. 4 is 6.

For example, the initialization overhead in the first embodiment can be reduced in comparison with an initialization overhead in a disk array device having a global parity including all data disks as a calculation range. For instance, in a disk array device having a global parity including 10 data disks, the initialization overhead is 10. In the meantime, for instance, in the example illustrated in FIG. 4, the initialization overhead is 6. In this way, the initialization overhead can be reduced in the first embodiment.
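The comparison can be restated as the size of the largest calculation range, since the parities are generated in parallel. The Python sketch below assumes that reading; `initialization_overhead`, `LOCAL_RANGES` and `GLOBAL_RANGE` are hypothetical names introduced for illustration.

```python
# Calculation ranges of the six local parities in FIG. 4.
LOCAL_RANGES = {
    "L1": {1, 2, 3, 4, 5},
    "L2": {2, 3, 4, 5, 6},
    "L3": {4, 5, 6, 7},
    "L4": {5, 6, 7, 8, 9, 10},
    "L5": {8, 9, 10, 1, 2},
    "L6": {9, 10, 1, 2, 3},
}

# A single global parity whose calculation range is all ten data disks.
GLOBAL_RANGE = {"G": set(range(1, 11))}


def initialization_overhead(ranges):
    """Largest number of data disks any one parity must receive data from."""
    return max(len(disks) for disks in ranges.values())
```

Here `initialization_overhead(LOCAL_RANGES)` is 6 (determined by L4) and `initialization_overhead(GLOBAL_RANGE)` is 10.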

An overhead evaluation when a new data disk is added is described next. In the first embodiment, an overhead when a new data disk is added is defined as the number of local parities for which parity information is recalculated when the new data disk is added to a disk array.

When a new data disk is added, parity information is recalculated in a local parity the calculation range of which is changed to include the new data disk.

For instance, a case where a new disk D′5 is added between D5 and D6 in FIG. 4 is considered. In this case, D′5 is added to the calculation ranges of L1, L2, L3 and L4. Therefore, the sizes of the calculation ranges of L1 to L4 increase by 1. The parity generating units 33 of the management servers 22 that respectively manage L1 to L4, whose calculation ranges have been changed, obtain data of all the data disks included in the changed calculation ranges, and calculate parity information by using the obtained data. Then, each parity generating unit 33 stores the calculated parity information in the corresponding local parity.

In the meantime, since the calculation ranges of L5 and L6 remain unchanged, parity information is not recalculated. Accordingly, when the new disk D′5 is added in FIG. 4, four pieces of parity information of L1 to L4 are recalculated, so that an overhead when the new data disk is added is 4. As described above, in the first embodiment, a local parity in which the calculation range remains unchanged is present when a new data disk is added. Therefore, an overhead when a new data disk is added can be reduced.

FIG. 5 is a flowchart illustrating a process in the first embodiment which is executed when a disk fault occurs.

In FIG. 5, the recovering unit 32 initially determines whether or not a local parity is included in faulty disks (S101). If the local parity is included in the faulty disks (“YES” in S101), the recovering unit 32 excludes the faulty local parity from a set of local parities used for a recovery (S102). Then, the process proceeds to S103.

If the local parity is not included in the faulty disks in S101 (“NO” in S101), the recovering unit 32 selects one of the combinations of local parities whose number of local parities is equal to the number of faulty data disks (S103). Here, the combination selected by the recovering unit 32 is assumed to be a combination of local parities which has not been selected yet in S103.

Next, the recovering unit 32 determines whether or not the combination selected in S103 is present (S104). If the recovering unit 32 determines that the combination is not present (“NO” in S104), the faulty data disk cannot be recovered. Therefore, the process is abnormally terminated.

In the meantime, if the recovering unit 32 determines that the combination is present (“YES” in S104), the recovering unit 32 further determines whether or not the selected combination contains a local parity that includes none of the faulty data disks in a calculation range (S105). If the recovering unit 32 determines that a local parity that includes none of the faulty data disks in the calculation range is present (“YES” in S105), the process returns to S103.

In the meantime, if the recovering unit 32 determines in S105 that every local parity of the selected combination includes at least one faulty data disk in its calculation range (“NO” in S105), the recovering unit 32 further determines whether or not every faulty data disk is included in the calculation range of at least one of the local parities of the selected combination (S106). If the recovering unit 32 determines that any one of the faulty data disks is not included in the calculation ranges of the local parities of the selected combination (“NO” in S106), the process returns to S103, and a new combination of local parities is selected.

In the meantime, if the recovering unit 32 determines that all the faulty data disks are included in the calculation ranges of the local parities of the selected combination (“YES” in S106), the recovering unit 32 recovers the data disks by solving simultaneous linear equations (S107). Specifically, the recovering unit 32 obtains data of the local parities of the combination selected in S103, and data of the data disks included in the calculation ranges of the local parities of the selected combination. Note that, however, the faulty data disks are not regarded as targets to be obtained. Then, the recovering unit 32 recovers the data of the faulty data disks by solving the simultaneous linear equations with the use of the parity calculation expression by using the obtained data of the local parities and the data of the data disks included in the calculation ranges of the local parities.

Then, the process is terminated normally.
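The selection loop of FIG. 5 (S101 to S106) can be sketched as follows. This is a minimal illustration under the FIG. 4 calculation ranges, not the program of the embodiment; `select_for_recovery`, `RecoveryError` and `CALC_RANGES` are hypothetical names, and the solving step S107 is only indicated by a comment.

```python
from itertools import combinations

CALC_RANGES = {
    "L1": {1, 2, 3, 4, 5},
    "L2": {2, 3, 4, 5, 6},
    "L3": {4, 5, 6, 7},
    "L4": {5, 6, 7, 8, 9, 10},
    "L5": {8, 9, 10, 1, 2},
    "L6": {9, 10, 1, 2, 3},
}


class RecoveryError(Exception):
    """Raised when no usable combination of local parities exists."""


def select_for_recovery(faulty_data, faulty_parities=(), ranges=CALC_RANGES):
    """Return the first combination of local parities usable for the recovery."""
    faulty_data = set(faulty_data)
    excluded = set(faulty_parities)
    # S101/S102: exclude faulty local parities from the candidates.
    usable = [p for p in sorted(ranges) if p not in excluded]
    # S103/S104: try each combination once; the loop ends when none is left.
    for combo in combinations(usable, len(faulty_data)):
        # S105: a selected parity that covers no faulty data disk is rejected.
        if any(not (ranges[p] & faulty_data) for p in combo):
            continue
        # S106: every faulty data disk must be covered by a selected parity.
        covered = set().union(*(ranges[p] for p in combo))
        if not faulty_data <= covered:
            continue
        return combo  # S107 would solve the simultaneous linear equations here
    # "NO" in S104: abnormal termination.
    raise RecoveryError("no combination satisfies the selection conditions")
```

For faults in L1, D5 and D8, `select_for_recovery({5, 8}, {"L1"})` returns (L2, L4) in this sketch, which is one of the five combinations given in the description; which of the valid combinations is tried first is an artifact of the iteration order chosen here.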

A configuration of the management servers 22 is described next. FIG. 6 illustrates an example of a hardware configuration of the management servers 22.

Each of the management servers 22 includes a CPU (Central Processing Unit) 401, a memory 402, a storage device 403, a reading device 404 and a communication interface 405. The CPU 401, the memory 402, the storage device 403, the reading device 404 and the communication interface 405 are interconnected by a bus.

The CPU 401 provides all or some of the functions of the control unit 31, the recovering unit 32 and the parity generating unit 33 by executing a program that describes the steps of the above described flowcharts with the use of the memory 402.

The memory 402 is, for example, a semiconductor memory, and is configured by including a RAM (Random Access Memory) area and a ROM (Read Only Memory) area. The storage device 403 is, for example, a hard disk. The storage device 403 may be a semiconductor memory such as a flash memory or the like. Alternatively, the storage device 403 may be an external recording device. The storage device 403 provides all or some of the functions of the storage unit 34.

The reading device 404 accesses an insertable/removable storage medium 450 according to an instruction of the CPU 401. The insertable/removable storage medium 450 is implemented, for example, with a semiconductor device (USB memory or the like), a medium (such as a magnetic disk or the like) to/from which information is input/output with a magnetic action, a medium (such as a CD-ROM, a DVD or the like) to/from which information is input/output with an optical action, or the like. Note that the reading device 404 does not need to always be included in the management server 22.

The communication interface 405 transmits and receives data to and from the client server 21 and other management servers 22.

A program according to the embodiment is provided to the management server 22 in, for example, the following forms.

(1) Preinstalled in the storage device 403.
(2) Provided by the insertable/removable storage medium 450.
(3) Provided from a program server (not illustrated) via the communication interface 405.

Additionally, part of the management server 22 in this embodiment may be implemented with hardware. Alternatively, the management server 22 in this embodiment may be implemented by combining software and hardware.

Note that the client server 21 and the management server 22 may be included in the same housing, and the functions of the client server 21 and the management server 22 may be provided by the same CPU. Moreover, the management server 22 may be a controller module included in the storage device.

Furthermore, data disks and parity disks have been described by making a distinction between them in this embodiment. However, a physical disk on which parity information for each stripe is stored may be modified. Namely, data of one stripe and parity information of another stripe may be stored on the same physical disk.

In the first embodiment, a data transfer amount in a network can be reduced by using not a global parity but local parities even when an initial parity is generated or a double (multiple) fault occurs in disks. Accordingly, the length of time needed to recover data can be reduced. Moreover, a data transfer amount needed when consistency is checked may be reduced for a similar reason.

Additionally, the storage system according to the first embodiment supports a recovery performed when a fault occurs on a specified number of disks by making calculation ranges of local parities partially overlap even though only the local parities are employed. Moreover, only some of the local parities need to be recalculated when a new disk is added, whereby a data transfer amount can be reduced.

Second Embodiment

A second embodiment is configured so that the numbers of data disks included in calculation ranges (sizes of calculation ranges) of local parities become equal. As a result, whichever local parity is used to recover data of a faulty data disk, the recovery overhead is always equal. Accordingly, an RTO (Recovery Time Objective) estimated in the worst case can be minimized. RTO is a target value of the time needed from an occurrence of a fault in a disk until completion of a recovery of the disk.

A configuration of an information processing system according to the second embodiment is the same as that according to the first embodiment except that sizes of calculation ranges of local parities become equal.

FIG. 7 illustrates an example of calculation ranges of local parities in the second embodiment. FIG. 7 illustrates the example of a disk array including 10 data disks and six local parities. The data disks and the local parities are indicated by D1 to D10 and L1 to L6, respectively.

The local parity L1 includes five data disks D1 to D5 as a calculation range. Moreover, the local parities L2, L3, L4, L5 and L6 respectively include data disks D2 to D6, D3 to D7, D6 to D10, D8 to D2 (D8, D9, D10, D1, D2) and D9 to D3 (D9, D10, D1, D2, D3) as a calculation range. Here, none of the calculation ranges of the local parities L1 to L6 are the same, in a similar manner as in the first embodiment, and all include at least one data disk common to at least one other local parity as a calculation range.

Unlike the first embodiment, the calculation ranges of all the local parities L1 to L6 include an equal number of data disks, namely, five data disks in the example illustrated in FIG. 7.

A recovery method used when a fault occurs in a disk is similar to that in the first embodiment. However, a recovery overhead becomes equal whichever combination is selected from among combinations that satisfy the selection conditions in a selection of a combination of local parities used for a recovery.

When a single fault occurs, the recovery overhead equals the size of the calculation range of the local parity used for the recovery. In the example illustrated in FIG. 7, the size of the calculation range of each of the local parities is 5. Therefore, the recovery overhead when a single fault occurs is 5.

When a multiple fault occurs, the recovery overhead is the total of the data of the local parities of a selected combination and those of the non-faulty data disks included in the calculation ranges of the local parities of the selected combination. Accordingly, when a double fault occurs, for example, in D2 and D5 in FIG. 7, and D2 and D5 are recovered by using data of L1 and L6, data of seven disks D1, D3, D4, D9, D10, L1 and L6 are transferred to the management servers that respectively manage D2 and D5.
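The overhead count described above can be checked with a short sketch. The function name `recovery_overhead` is hypothetical; it simply counts one block per selected local parity plus one block per non-faulty data disk in the combined calculation ranges, using the FIG. 7 ranges quoted in the text.

```python
def recovery_overhead(selected_ranges, faulty):
    """Number of blocks transferred for a recovery.

    selected_ranges: dict of local parity name -> set of data disk numbers
    faulty:          set of faulty data disk numbers
    """
    union = set().union(*selected_ranges.values())
    healthy_data = union - set(faulty)
    # One block per selected parity plus one per healthy data disk.
    return len(selected_ranges) + len(healthy_data)

# FIG. 7: L1 covers D1 to D5, L6 covers D9, D10, D1, D2, D3.
single = recovery_overhead({"L1": {1, 2, 3, 4, 5}}, {2})
double = recovery_overhead(
    {"L1": {1, 2, 3, 4, 5}, "L6": {9, 10, 1, 2, 3}}, {2, 5}
)
print(single, double)
```

For the single fault this gives 5, and for the double fault in D2 and D5 it gives 7, matching the figures stated above.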

In the system according to the second embodiment, RTO estimated in the worst case can be minimized in comparison with the first embodiment.

Third Embodiment

In a third embodiment, local parities are classified into a plurality of groups so that the data disks included in the calculation ranges of the local parities within the same group all differ. Moreover, the local parities are classified so that each of the data disks is included in any of the calculation ranges of local parities included in each of the groups.

A configuration of an information processing system according to the third embodiment is the same as that in the second embodiment except that local parities are classified into groups.

In the third embodiment, all data disks are included in calculation ranges of a number of local parities equal to that of the groups, and a number of faults of disks equal to that of the groups can be recovered. Namely, a minimum Hamming distance increases, improving an evaluation of reliability.

FIG. 8 illustrates an example of calculation ranges of local parities in the third embodiment. FIG. 8 illustrates an example of a disk array including 10 data disks and six local parities. Moreover, the data disks and the local parities are indicated by D1 to D10 and L1 to L6, respectively.

The local parity L1 includes the five data disks D1 to D5 as a calculation range. Moreover, the local parities L2, L3, L4, L5 and L6 respectively include D2 to D6, D3 to D7, D6 to D10, D7 to D1 (D7, D8, D9, D10, D1), and D8 to D2 (D8, D9, D10, D1, D2) as a calculation range. Here, none of L1 to L6 have the same calculation range, and all include at least one data disk common to at least one other local parity as a calculation range.

In the third embodiment, local parities are classified into a plurality of groups. The classification of local parities into groups is performed so that all data disks respectively included in calculation ranges of the local parities within the same group may differ. In other words, the local parities are classified into the plurality of groups so that an arbitrary pair of local parities within the group does not include the same data disk in a calculation range. Moreover, the local parities are classified so that each of the data disks may be included in any of the calculation ranges of the local parities included in each of the groups.

In the example illustrated in FIG. 8, the local parities are classified into three groups so that L1 and L4, L2 and L5, and L3 and L6 are respectively included in the same group.

Data disks included in the calculation range of L1 are D1 to D5, whereas those included in the calculation range of L4 are D6 to D10. Accordingly, in the group of L1 and L4, all the data disks included in the calculation ranges of the local parities are different. In other words, the data disks included in the calculation ranges of L1 and L4 are not redundant. Moreover, each of the data disks D1 to D10 that configure the disk array is included in either of the calculation ranges of L1 and L4.
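The two grouping conditions (no shared data disk within a group, and every data disk covered by each group) can be checked mechanically. The following is a minimal sketch with hypothetical helper names `calc_range` and `is_valid_group`; the starting positions are taken from the FIG. 8 ranges quoted above.

```python
def calc_range(start, size, k):
    """Cyclic window of `size` data disks starting at disk `start` (1-based)."""
    return {(start - 1 + i) % k + 1 for i in range(size)}

def is_valid_group(group_ranges, k):
    """True if the ranges are pairwise disjoint and together cover D1..Dk."""
    seen = set()
    for rng in group_ranges:
        if seen & rng:          # a data disk appears in two ranges of the group
            return False
        seen |= rng
    return seen == set(range(1, k + 1))

K, SIZE = 10, 5
fig8_starts = {"L1": 1, "L2": 2, "L3": 3, "L4": 6, "L5": 7, "L6": 8}
fig8 = {name: calc_range(s, SIZE, K) for name, s in fig8_starts.items()}

groups = [("L1", "L4"), ("L2", "L5"), ("L3", "L6")]
results = [is_valid_group([fig8[a], fig8[b]], K) for a, b in groups]
print(results)
```

All three groups of FIG. 8 pass the check, whereas a pairing such as L1 and L2 fails because their ranges share D2 to D5.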

In the example of FIG. 7 referred to in the second embodiment, the number of local parities that include D2 and D3 in the calculation range is 4, whereas the number of local parities that include D7 and D8 in the calculation range is 2. This means that a data loss rate varies depending on a data disk. For example, when a single fault occurs, the data disk D2 can be recovered unless a fault occurs in all of the four local parities L1, L2, L5 and L6. However, the data disk D7 cannot be recovered if a fault occurs in the local parities L3 and L4. In the example illustrated in FIG. 7, the minimum Hamming distance is 3, deteriorating an evaluation of reliability.

In comparison with this, in the example of the third embodiment illustrated in FIG. 8, all the data disks are included in the calculation ranges of a number of the local parities equal to that of the groups. Namely, data loss rates of all of D1 to D10 are equal. For example, when a single fault occurs, the data disks D1 to D10 can be recovered unless a fault occurs in the three local parities which include each of the data disks in the calculation range, and the minimum Hamming distance is 4. Accordingly, the evaluation of reliability is improved in comparison with the example illustrated in FIG. 7.
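The difference in coverage between FIG. 7 and FIG. 8 can be reproduced by counting, for each data disk, how many local parities include it in a calculation range. This is an illustrative sketch; `calc_range` and `coverage` are hypothetical names, and the starting positions are read off the ranges given in the text.

```python
from collections import Counter

def calc_range(start, size, k):
    """Cyclic window of `size` data disks starting at disk `start` (1-based)."""
    return {(start - 1 + i) % k + 1 for i in range(size)}

def coverage(starts, size, k):
    """Map each data disk number to the number of local parities covering it."""
    counts = Counter()
    for s in starts:
        counts.update(calc_range(s, size, k))
    return counts

fig7 = coverage([1, 2, 3, 6, 8, 9], size=5, k=10)   # second embodiment
fig8 = coverage([1, 2, 3, 6, 7, 8], size=5, k=10)   # third embodiment

print(sorted(fig7.items()))
print(sorted(fig8.items()))
```

In the FIG. 7 layout the counts range from 2 (D7, D8) to 4 (D2, D3), while in the FIG. 8 layout every data disk is covered by exactly three local parities, one per group.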

Fourth Embodiment

In a fourth embodiment, calculation ranges of all local parities included in a disk array are configured to be shifted by an equal interval. Namely, initial data disks of the calculation ranges of all the local parities are configured to be shifted by (the number of data disks)/(the number of local parities). When k is divisible by p, starting positions of the calculation ranges of the local parities can be arranged at equal intervals on the data disks. Otherwise, the starting positions are arranged at almost equal intervals by using an integer immediately below or above the value calculated by dividing k by p. For example, when k=10 and p=6, k is not divisible by p (k/p=1.67). Therefore, an interval between the starting positions is set to 1 or 2.

A configuration of an information processing system according to the fourth embodiment is the same as that according to the third embodiment except that calculation ranges of all local parities included in a disk array are configured to be shifted by an equal interval.

FIG. 9 illustrates an example of calculation ranges of local parities in the fourth embodiment. FIG. 9 illustrates an example of a disk array including 10 data disks and six local parities. The data disks and the local parities are indicated by D1 to D10 and L1 to L6, respectively. Assume that identification numbers D1 to D10 are respectively assigned in ascending order to a storage order of data stripes of the data disks 2.

The local parity L1 includes the five data disks D1 to D5 as a calculation range. Moreover, the local parities L2, L3, L4, L5 and L6 respectively include the data disks D3 to D7, D5 to D9, D6 to D10, D8 to D2 (D8, D9, D10, D1, D2), and D10 to D4 (D10, D1, D2, D3, D4) as a calculation range. Here, none of the local parities L1 to L6 have the same calculation range, and all include at least one data disk common to at least one other local parity as a calculation range, similarly to the first embodiment.

In the example illustrated in FIG. 9, (the number of data disks)/(the number of local parities)=10/6=1.67. Therefore, calculation range starting positions of the local parities are shifted by 1 or 2. This makes it easier to select local parities having close calculation range starting positions when a plurality of data disks are recovered. As a result, the number of data disks included in a calculation range of a local parity decreases, so that a recovery overhead can be reduced.
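One way to place the starting positions at almost equal intervals is sketched below. The rounding rule (the ceiling of i·k/p) is an assumption chosen for illustration; the text only requires that consecutive starting positions differ by one of the two integers adjacent to k/p. The function name `start_positions` is hypothetical.

```python
def start_positions(k, p):
    """Starting data disk (1-based) of each of the p calculation ranges.

    Uses ceil(i * k / p) computed with integer arithmetic; this rounding
    rule is an assumption made for the sketch, not taken from the text.
    """
    return [(i * k + p - 1) // p + 1 for i in range(p)]

def cyclic_gaps(starts, k):
    """Intervals between consecutive starting positions, wrapping around."""
    return [(b - a) % k for a, b in zip(starts, starts[1:] + starts[:1])]

starts = start_positions(10, 6)
gaps = cyclic_gaps(starts, 10)
print(starts, gaps)
```

For k=10 and p=6 this rule yields the starting positions D1, D3, D5, D6, D8 and D10, which coincide with the calculation ranges of L1 to L6 in FIG. 9, and every interval is 1 or 2.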

A recovery method used when a fault occurs in a disk is similar to that in the first embodiment. However, for instance, when a fault occurs in D3 and D8 in the example illustrated in FIG. 9, L2 and L3 can be selected as a combination of local parities used to recover the data disks. When the data of D3 and D8 are recovered by using L2 and L3, data to be transferred for the recovery are data of six disks D4, D5, D6, D7, L2 and L3. Accordingly, a recovery overhead in this case is 6.

In the meantime, in the case of the example of FIG. 8 in the third embodiment, when a fault occurs in D3 and D8, L3 and L4 can be selected as a combination of local parities used to recover the data disks. When the data of D3 and D8 are recovered by using L3 and L4, data to be transferred for the recovery are data of the eight disks D4, D5, D6, D7, D9, D10, L3 and L4. Accordingly, a recovery overhead in this case is 8. Therefore, according to the fourth embodiment, a recovery overhead can be reduced in comparison with the third embodiment.

As described above, the fourth embodiment is configured so that calculation range starting positions of local parities are shifted by an equal interval, whereby a recovery overhead can be reduced.

Fifth Embodiment

When the number of faulty disks is smaller than that of local parities, the combination of local parities used for a recovery is not uniquely determined. In a fifth embodiment, when a combination of local parities used for the recovery is selected, the combination is selected so that a sum of the sets (that is, the union) of data disks included in calculation ranges of all local parities included in the combination is minimized.

For instance, a case where a fault occurs in the two data disks D1 and D2 in the example illustrated in FIG. 9 is considered. In this case, there are three combinations of local parities which satisfy the selection conditions, (L1, L5), (L1, L6) and (L5, L6). By using any of these three combinations, the two data disks D1 and D2 can be recovered.

In the fifth embodiment, the combination of local parities used for the recovery is selected so that a sum of the sets of data disks included in calculation ranges of all local parities included in a combination among combinations of local parities which satisfy the selection conditions may be minimized. In the example illustrated in FIG. 9, in the case of the combination (L1, L5), a sum of the sets of data disks included in the calculation ranges of L1 and L5 is 8 data disks {D1, D2, D3, D4, D5, D8, D9, D10}. In the case of the combination (L1, L6), a sum of the sets of data disks included in the calculation ranges of L1 and L6 is 6 data disks {D1, D2, D3, D4, D5, D10}. In the case of the combination (L5, L6), a sum of the sets of data disks included in the calculation ranges of L5 and L6 is 7 data disks {D1, D2, D3, D4, D8, D9, D10}. Accordingly, a combination that minimizes a sum of the sets of data disks included in calculation ranges of local parities is (L1, L6). Therefore, the recovering unit 32 selects the combination of (L1, L6) as the combination of local parities used for the recovery.

In the following description, a sum of the sets of data disks included in calculation ranges of all local parities included in a combination is sometimes referred to as a sum of the sets of calculation ranges of a combination.
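The sizes of the sums of the sets for the three combinations named above can be tabulated with a few lines of code. This is only an illustrative sketch using the FIG. 9 calculation ranges of L1, L5 and L6 as quoted in the text; the variable names are hypothetical.

```python
# Calculation ranges of the three local parities that cover D1 and D2 in FIG. 9.
ranges = {
    "L1": {1, 2, 3, 4, 5},
    "L5": {8, 9, 10, 1, 2},
    "L6": {10, 1, 2, 3, 4},
}

combinations_in_text = [("L1", "L5"), ("L1", "L6"), ("L5", "L6")]

# Size of the sum (union) of the sets of calculation ranges per combination.
union_sizes = {
    combo: len(ranges[combo[0]] | ranges[combo[1]])
    for combo in combinations_in_text
}

# The combination with the smallest union is the one used for the recovery.
best = min(union_sizes, key=union_sizes.get)
print(union_sizes, best)
```

The sizes come out as 8, 6 and 7, so (L1, L6) is the combination with the smallest sum of the sets.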

Operations of a process executed when a disk fault occurs in the fifth embodiment are described next with reference to FIGS. 10 and 11. FIG. 10 is the first half of a flowchart illustrating the process in the fifth embodiment which is executed when a disk fault occurs. FIG. 11 is the latter half of the flowchart illustrating the process in the fifth embodiment which is executed when the disk fault occurs.

In FIG. 10, the recovering unit 32 initially determines whether or not a local parity is included in faulty disks (S201). If the local parity is included in the faulty disks (“YES” in S201), the recovering unit 32 excludes the faulty local parity from a set of local parities used for a recovery (S202). Then, the process proceeds to S203.

If the local parity is not included in the faulty disks in S201 (“NO” in S201), the recovering unit 32 selects one of the combinations of local parities, the number of which is equal to that of the faulty data disks (S203). Here, assume that the combination selected by the recovering unit 32 is a combination that has not been selected yet by the recovering unit 32 in S203.

Next, the recovering unit 32 determines whether or not the combination selected in S203 is present (S204).

If the recovering unit 32 determines that the combination is present (“YES” in S204), the recovering unit 32 further determines whether or not a local parity including no faulty data disks in a calculation range is present in the combination selected in S203 (S205). If the recovering unit 32 determines that the local parity including no faulty data disks in the calculation range is present (“YES” in S205), the process returns to S203.

In the meantime, if the recovering unit 32 determines that the local parity including no faulty data disks in the calculation range is not present (“NO” in S205), the recovering unit 32 further determines whether or not every faulty data disk is included in any of the calculation ranges of the local parities in the combination selected in S203 (S206). If the recovering unit 32 determines that some faulty data disk is included in none of the calculation ranges of the local parities in the selected combination (“NO” in S206), the process returns to S203, in which a new combination of local parities is selected.

In the meantime, if the recovering unit 32 determines that all the faulty data disks are included in any of the calculation ranges of the local parities in the combination selected in S203 (“YES” in S206), the recovering unit 32 compares the selected combination with a combination candidate in terms of the number of data disks in the sum of the sets of calculation ranges (S207). Here, the combination candidate is the candidate set in S208. Specifically, if the combination candidate is present, the recovering unit 32 determines whether or not the sum of the sets of calculation ranges of the combination most recently selected in S203 is larger than the sum of the sets of calculation ranges of the combination candidate. If the recovering unit 32 determines that the sum of the sets of calculation ranges of the combination most recently selected in S203 is larger than the sum of the sets of calculation ranges of the combination candidate (“YES” in S207), the process returns to S203, in which a new combination of local parities is selected.

In the meantime, if the recovering unit 32 determines that the sum of the sets of calculation ranges of the combination most recently selected in S203 is not larger than the sum of the sets of calculation ranges of the combination candidate (“NO” in S207), the recovering unit 32 sets the combination most recently selected in S203 as the new combination candidate (S208). Then, the process returns to S203, in which a new combination of local parities is selected.

If the recovering unit 32 determines that the combination is not present in S204 (“NO” in S204), the recovering unit 32 further determines whether or not a combination candidate is present (S209) (see FIG. 11). Here, the combination candidate is the combination set in S208. If the recovering unit 32 determines that the combination candidate is present (“YES” in S209), the recovering unit 32 recovers the data disks by solving simultaneous linear equations with the use of the combination candidate (S210). Specifically, the recovering unit 32 obtains data of local parities of the combination candidate, and those of data disks included in the calculation ranges of the local parities of the combination candidate. Note that, however, the faulty data disks are not regarded as targets to be obtained. Then, the recovering unit 32 recovers the data of the faulty data disks by solving simultaneous linear equations with the use of the parity calculation expression by using the obtained data of the local parities and those of the data disks included in the calculation ranges of the local parities. Then, the process is terminated normally.

If the recovering unit 32 determines that the combination candidate is not present in S209 (“NO” in S209), the faulty data disks cannot be recovered. Therefore, the process is terminated abnormally.
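The selection loop of S201 to S209 can be summarized in a compact sketch. The function name `select_combination` and its parameters are hypothetical, the enumeration order of combinations is an arbitrary choice, and the actual recovery of S210 is left out; the step numbers in the comments map each statement to the flowchart.

```python
from itertools import combinations

def select_combination(ranges, faulty_data, faulty_parities=()):
    """Return the combination of local parities chosen for a recovery.

    ranges:          dict of local parity name -> set of data disk numbers
    faulty_data:     iterable of faulty data disk numbers
    faulty_parities: iterable of faulty local parity names
    """
    # S201, S202: exclude faulty local parities from the usable set.
    usable = {n: r for n, r in ranges.items() if n not in set(faulty_parities)}
    faulty = set(faulty_data)
    candidate = None  # (combination, union of its calculation ranges)
    # S203, S204: try every combination whose size equals the number of faults.
    for combo in combinations(sorted(usable), len(faulty)):
        # S205: skip if some selected parity covers no faulty data disk.
        if any(not (usable[n] & faulty) for n in combo):
            continue
        union = set().union(*(usable[n] for n in combo))
        # S206: skip if some faulty data disk is covered by no selected parity.
        if not faulty <= union:
            continue
        # S207, S208: keep the combination unless its union is larger.
        if candidate is None or len(union) <= len(candidate[1]):
            candidate = (combo, union)
    # S209: without a candidate the process terminates abnormally.
    if candidate is None:
        raise RuntimeError("faulty data disks cannot be recovered")
    return candidate[0]

fig9 = {
    "L1": {1, 2, 3, 4, 5}, "L2": {3, 4, 5, 6, 7}, "L3": {5, 6, 7, 8, 9},
    "L4": {6, 7, 8, 9, 10}, "L5": {8, 9, 10, 1, 2}, "L6": {10, 1, 2, 3, 4},
}
chosen = select_combination(fig9, faulty_data=[1, 2])
chosen_without_l6 = select_combination(fig9, [1, 2], faulty_parities=["L6"])
print(chosen, chosen_without_l6)
```

For faults in D1 and D2 of FIG. 9 the sketch selects (L1, L6); if L6 is also faulty, only (L1, L5) remains.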

Comparison examples 1 and 2 are described next to help understanding of advantages of the first to the fifth embodiments.

FIG. 12 illustrates an example of a data configuration of the comparison example 1. In a distributed storage of the comparison example 1, parity information is calculated by using data of all disks to be made redundant. Namely, in FIG. 12, parity disks (global parities) G1 to G6 are configured by including all data disks in a calculation range.

FIG. 13 illustrates an example of a data configuration of the comparison example 2. In a distributed storage of the comparison example 2, some of the parity information is calculated as local parities. Namely, in the example illustrated in FIG. 13, parity disks (global parities) G1 to G4 are configured by including all data disks in a calculation range, and local parities L1 and L2 are configured by including mutually exclusive data disks in a calculation range. Here, the local parities L1 and L2 do not include a data disk common to each other in the calculation ranges in the comparison example 2.

FIG. 14 illustrates a performance comparison between the fifth embodiment and the comparison examples. In FIG. 14, a record of the comparison example 1 (10, 6) indicates performance information of the comparison example 1 where 10 data disks and six parity disks are included, as in the configuration illustrated in FIG. 12. A record of the comparison example 2 (10, 6, 5) indicates performance information of the comparison example 2 where 10 data disks and six parity disks are included and the number of data disks included in the calculation ranges of the local parities is 5, as in the configuration illustrated in FIG. 13. A record of the fifth embodiment (10, 6, 5) indicates performance information of the fifth embodiment where 10 data disks and six parity disks are included and the number of data disks included in the calculation ranges of the local parities is 5. Assume that starting positions of the calculation ranges of the parity disks in the fifth embodiment (10, 6, 5) are distributed at equal intervals as in the fourth embodiment. A record of a replication (10×3) indicates performance information of the information processing system having a configuration implemented by replicating (duplicating) data disks triply.

A capacity efficiency indicates the ratio of an actually available disk capacity, obtained by subtracting redundant data, to the capacities of all disks used in a disk array. Capacity efficiencies in the comparison example 1 (10, 6), the comparison example 2 (10, 6, 5), the fifth embodiment (10, 6, 5), and the replication (10×3) are 0.625, 0.625, 0.625 and 0.33, respectively.
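The stated capacity efficiencies are consistent with dividing the number of data disks by the total number of disks. The following sketch assumes that reading; the function name `capacity_efficiency` is hypothetical.

```python
def capacity_efficiency(data_disks, redundant_disks):
    """Fraction of the total disk capacity that holds non-redundant data."""
    return data_disks / (data_disks + redundant_disks)

# 10 data disks with six parity disks (comparison examples and fifth embodiment).
parity_based = capacity_efficiency(10, 6)
# Triple replication: 10 data disks plus 20 disks holding the two extra copies.
replication = capacity_efficiency(10, 20)
print(parity_based, round(replication, 2))
```

Under this assumption, 10 data disks with six parity disks give 10/16, and triple replication gives 10/30.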

A minimum Hamming distance of data disks indicates a value of the minimum Hamming distance in a case where a fault occurs only in data disks. Minimum Hamming distances of data disks in the comparison example 1 (10, 6), the comparison example 2 (10, 6, 5), and the fifth embodiment (10, 6, 5) are 7, 7 and 7, respectively.

A minimum Hamming distance of all disks indicates a value of the minimum Hamming distance targeting a case where a fault occurs in all disks including parity disks (local parities). Minimum Hamming distances of all the disks in the comparison example 1 (10, 6), the comparison example 2 (10, 6, 5), the fifth embodiment (10, 6, 5), and the replication (10×3) are 7, 6, 4 and 3, respectively.

An overhead of a write process indicates a value of the number of inputs/outputs (the number of read/write processes) when a write process occurs in one block of a data disk. In the case of the comparison example 1 (10, 6), reads/writes of data from/to a block desired to be updated and those of data from/to all the parity disks G1 to G6 occur. Therefore, the overhead of the write process is 14. In the case of the comparison example 2 (10, 6, 5), reads/writes of data from/to a block desired to be updated, those of data from/to the four global parities G1 to G4, and those of data from/to either of L1 and L2 occur. Therefore, the overhead of the write process is 12. In the case of the fifth embodiment (10, 6, 5), reads/writes of data from/to a block desired to be updated and those of data from/to any three of the local parities L1 to L6 occur. Therefore, the overhead of the write process is 8. In the case of the replication (10×3), writes of data to all data disks occur. Therefore, the overhead of the write process is 3. Accordingly, compared with the comparison examples 1 and 2, the write overhead can be reduced in the fifth embodiment.
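The write overheads of the parity-based configurations follow one pattern: one read and one write for the updated data block, plus one read and one write for each parity that covers it. A minimal sketch of that arithmetic, with the hypothetical function name `write_overhead`:

```python
def write_overhead(parities_to_update):
    """Reads plus writes for updating one data block and its parities."""
    blocks_touched = 1 + parities_to_update   # the data block plus its parities
    return 2 * blocks_touched                 # one read and one write per block

comparison_1 = write_overhead(6)       # global parities G1 to G6
comparison_2 = write_overhead(4 + 1)   # G1 to G4 plus one of L1 and L2
fifth = write_overhead(3)              # three local parities cover each block
print(comparison_1, comparison_2, fifth)
```

Replication does not fit this formula because it needs only the three writes and no reads.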

A recovery overhead at the time of a single fault indicates a value of a recovery overhead when a single fault occurs. Here, the recovery overhead is a data transfer amount of data received by a management server that manages a faulty disk at the time of a recovery. In the case of the comparison example 1 (10, 6), a read from any of the parity disks and that from nine data disks that are not faulty occur. Therefore, the recovery overhead at the time of the single fault is 10. In the case of the comparison example 2 (10, 6, 5), a read from any of the local parities and that from four data disks that are included in the calculation range of the local parity and are not faulty occur. Therefore, the recovery overhead at the time of the single fault is 5. In the case of the fifth embodiment (10, 6, 5), a read from any of the local parities including a faulty data disk in a calculation range and that from four data disks that are included in the local parity and are not faulty occur. Therefore, the recovery overhead at the time of the single fault is 5. In the case of the replication (10×3), a read from a normal disk occurs. Therefore, the recovery overhead at the time of the single fault is 1. Accordingly, compared with the comparison example 1, the recovery overhead at the time of the single fault can be reduced in the fifth embodiment.

Similarly, values of a recovery overhead at the time of a double fault in the comparison example 1 (10, 6), the comparison example 2 (10, 6, 5), the fifth embodiment (10, 6, 5), and the replication (10×3) are 11, 10 or 11, 7 or 8 (7.36 on average), and 2, respectively. Accordingly, compared with the comparison examples 1 and 2, the recovery overhead at the time of the double fault can be reduced in the fifth embodiment.

Similarly, values of a recovery overhead at the time of a triple fault in the comparison example 1 (10, 6), the comparison example 2 (10, 6, 5), and the fifth embodiment (10, 6, 5) are 12, 12, and 9, respectively. Note that data cannot be recovered in the replication (10×3) at the time of the triple fault. Accordingly, compared with the comparison examples 1 and 2, the recovery overhead at the time of the triple fault can be reduced in the fifth embodiment.

An overhead of a scrub process is a data transfer amount that occurs in a process for checking data consistency (completeness). Overheads of the scrub process in the comparison example 1 (10, 6), the comparison example 2 (10, 6, 5), the fifth embodiment (10, 6, 5), and the replication (10×3) are 15, 15, 15 and 20, respectively.

As illustrated in FIG. 14, the fifth embodiment does not have the worst value in any of the entries, and is well-balanced.

The embodiment is not limited to the above described embodiments, and can take various configurations or embodiments within a scope that does not depart from the gist of the present invention.

All examples and conditional language provided herein are intended for pedagogical purposes of aiding the reader in understanding the invention and the concepts contributed by the inventor to further the art, and are not to be construed as limitations to such specifically recited examples and conditions, nor does the organization of such examples in the specification relate to a showing of the superiority and inferiority of the invention. Although one or more embodiments of the present invention have been described in detail, it should be understood that the various changes, substitutions, and alterations could be made hereto without departing from the spirit and scope of the invention.

What is claimed is:
 1. A storage system comprising: a plurality of data disks that store information; and a parity disk that corresponds to a disk group including some of the plurality of data disks, and stores parity information generated on the basis of data of the data disks included in the corresponding disk group, wherein any of the data disks is included in a plurality of the disk groups.
 2. The storage system according to claim 1, wherein each of the plurality of data disks is included in any of the disk groups.
 3. The storage system according to claim 1, wherein the number of data disks included in each of the disk groups is equal in the disk groups.
 4. The storage system according to claim 1, wherein a plurality of parity disks are classified into a plurality of parity groups so that an arbitrary pair of parity disks within each of the parity groups does not include the same data disk in the corresponding disk group, and each of the plurality of data disks is included in any of the disk groups corresponding to the parity disk included in each of the parity groups.
 5. The storage system according to claim 1, wherein identification numbers are made to correspond respectively in ascending order to a storage order of data stripes of the data disks, and a minimum value of the identification numbers of the data disks included in the disk groups each corresponding to the parity disk is shifted by a number calculated by dividing the number of data disks by the number of parity disks.
 6. The storage system according to claim 1, the storage system further comprising a recovering unit configured to recover data of a faulty data disk by using the parity disk corresponding to the disk group including the faulty data disk when a fault occurs in one or more data disks.
 7. The storage system according to claim 6, wherein the recovering unit selects a number of parity disks which is equal to the number of the faulty data disks so that all the faulty data disks are included in the disk group corresponding to any of the parity disks to be selected and one or more the faulty data disks are included in the disk groups corresponding to all the parity disks to be selected, and the recovering unit recovers the data of the faulty data disks on the basis of the selected parity disks.
 8. The storage system according to claim 7, wherein the recovering unit selects a number of parity disks which is equal to the number of the faulty data disks so that all the faulty data disks are included in the disk group corresponding to any of the parity disks to be selected, the one or more faulty data disks are included in the disk groups corresponding to all the parity disks to be selected, and a sum of the sets of the data disks included in the disk groups corresponding to the parity disks to be selected is minimized, and the recovering unit recovers the data of the faulty data disks on the basis of the selected parity disks.
 9. A non-transitory computer-readable recording medium having stored therein a program for causing a computer to execute a process, the process comprising: recovering data of a faulty data disk by using a parity disk corresponding to a disk group including the faulty data disk when a fault occurs in one or more data disks in a storage system including a plurality of data disks that store information, and a parity disk that corresponds to a disk group including some of the plurality of data disks and that stores parity information generated on the basis of data of the data disks included in the corresponding disk group, wherein any of the data disks is included in the plurality of disk groups in the storage system.